A Proof of Proposition

Neural Information Processing Systems

In this appendix we prove Proposition 1 from Section 4. We next derive two lemmas that will be used in the proofs of our theorems. Hence we select the most under-sampled action as the relevant parameter tends to 1 in Algorithm 1. Lemma 2: let s be a state that we visit m times. The proof follows from Lemma 1 and proceeds by induction.


From Initial Data to Boundary Layers: Neural Networks for Nonlinear Hyperbolic Conservation Laws

Ciril, Igor, Haddaoui, Khalil, Tendero, Yohann

arXiv.org Artificial Intelligence

Abstract--We address the approximation of entropy solutions to initial-boundary value problems for nonlinear strictly hyperbolic conservation laws using neural networks. A general and systematic framework is introduced for the design of efficient and reliable learning algorithms, combining fast convergence during training with accurate predictions. The methodology, which relies on solving a suitably relaxed related problem, is assessed through a series of one-dimensional scalar test cases. These numerical experiments demonstrate the potential of the methodology developed in this paper and its applicability to more complex industrial scenarios. Nonlinear hyperbolic conservation laws play a central role in the mathematical modeling of physical systems where transport and wave propagation phenomena dominate.
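For context on the class of problems targeted here (this is a classical reference scheme, not the paper's neural-network method, and the grid sizes are illustrative choices): entropy solutions of a 1-D scalar conservation law such as the Burgers equation u_t + (u^2/2)_x = 0 can be computed with a few lines of Godunov finite-volume code, which is the usual source of reference solutions for such test cases.

```python
import numpy as np

def godunov_burgers(u0, dx, dt, steps):
    """Advance u_t + (u^2/2)_x = 0 with the Godunov scheme.

    For the convex Burgers flux f(u) = u^2/2, the Godunov interface flux
    reduces to F(ul, ur) = max(f(max(ul, 0)), f(min(ur, 0))), which
    automatically selects the entropy-satisfying shock or rarefaction.
    """
    f = lambda u: 0.5 * u * u
    u = u0.copy()
    for _ in range(steps):
        ul, ur = u[:-1], u[1:]                           # interface left/right states
        flux = np.maximum(f(np.maximum(ul, 0.0)), f(np.minimum(ur, 0.0)))
        u[1:-1] -= (dt / dx) * (flux[1:] - flux[:-1])    # interior cells; ends held fixed
    return u
```

With Riemann initial data u = 1 for x < 0 and u = 0 for x > 0, the Rankine-Hugoniot condition predicts a shock travelling at speed 1/2, which the scheme reproduces up to numerical smearing.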


On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

Neural Information Processing Systems

We investigate the statistical and computational limits of latent Diffusion Transformers (DiTs) under the low-dimensional linear latent space assumption. Statistically, we study the universal approximation and sample complexity of the DiTs score function, as well as the distribution recovery property of the initial data. Specifically, under mild data assumptions, we derive an approximation error bound for the score network of latent DiTs, which is sub-linear in the latent space dimension. Additionally, we derive the corresponding sample complexity bound and show that the data distribution generated from the estimated score function converges toward a proximate area of the original one. Computationally, we characterize the hardness of both forward inference and backward computation of latent DiTs, assuming the Strong Exponential Time Hypothesis (SETH). For forward inference, we identify efficient criteria for all possible latent DiTs inference algorithms and showcase our theory by pushing the efficiency toward almost-linear time inference.
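A small numpy check (ours, not the paper's construction; the symbols B, a, D, d are illustrative) of what the low-dimensional linear latent assumption buys: when data lie on a d-dimensional subspace spanned by B and are diffused with Gaussian noise, the exact score of the noisy marginal is linear and splits into an on-subspace part and a trivial off-subspace whitening term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Low-dimensional linear latent assumption: x0 = B z with z in R^d, d << D.
D, d, n = 16, 2, 1024
B, _ = np.linalg.qr(rng.standard_normal((D, d)))   # orthonormal basis of the latent subspace
x0 = rng.standard_normal((n, d)) @ B.T             # clean data living on the subspace

# One step of the forward diffusion: x_t = sqrt(a) x0 + sqrt(1 - a) eps.
a = 0.5
xt = np.sqrt(a) * x0 + np.sqrt(1 - a) * rng.standard_normal((n, D))

# For Gaussian z, the marginal of x_t is N(0, C) with C = a B B^T + (1 - a) I,
# so the exact score is the linear map score(x) = -C^{-1} x.
C = a * (B @ B.T) + (1 - a) * np.eye(D)
score = -xt @ np.linalg.inv(C)

# C has eigenvalue 1 on span(B) and (1 - a) on its orthogonal complement, so the
# score decomposes as -P x on the subspace and -(I - P) x / (1 - a) off it.
P = B @ B.T
```

In effect only the d on-subspace directions carry learnable structure; the off-subspace part is fixed noise whitening. This is the intuition behind score-approximation bounds that scale with the latent dimension rather than the ambient dimension D.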


Derivation of effective gradient flow equations and dynamical truncation of training data in Deep Learning

Chen, Thomas

arXiv.org Machine Learning

We derive explicit equations governing the cumulative biases and weights in Deep Learning with ReLU activation function, based on gradient descent for the Euclidean cost in the input layer, and under the assumption that the weights are, in a precise sense, adapted to the coordinate system distinguished by the activations. We show that gradient descent corresponds to a dynamical process in the input layer, whereby clusters of data are progressively reduced in complexity ("truncated") at an exponential rate that increases with the number of data points that have already been truncated. We provide a detailed discussion of several types of solutions to the gradient flow equations. A main motivation for this work is to shed light on the interpretability question in supervised learning.
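The object of study, gradient descent on a Euclidean cost through a ReLU layer, can be made concrete with a generic toy loop. This is plain gradient descent on a one-layer ReLU model (explicit Euler steps of the gradient flow), not the paper's derived flow equations or its truncation dynamics; the teacher-student setup, sizes, and step size are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher-student setup: targets come from a ReLU "teacher" layer.
X = rng.standard_normal((50, 4))
Y = np.maximum(X @ rng.standard_normal((4, 3)), 0.0)

W = 0.1 * rng.standard_normal((4, 3))    # student weights
b = np.zeros(3)                          # student biases
lr = 0.05

def cost(W, b):
    """Euclidean cost on the output of one ReLU layer."""
    return 0.5 * float(np.mean(np.sum((np.maximum(X @ W + b, 0.0) - Y) ** 2, axis=1)))

losses = []
for _ in range(300):                     # explicit Euler steps of the gradient flow
    Z = X @ W + b
    R = (np.maximum(Z, 0.0) - Y) * (Z > 0)   # residual, gated through the ReLU
    W -= lr * (X.T @ R) / len(X)
    b -= lr * R.mean(axis=0)
    losses.append(cost(W, b))
```

The gating factor `(Z > 0)` is where the activation pattern enters the dynamics; the paper's analysis concerns weights adapted to the coordinate system these activations distinguish.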


YanTian: An Application Platform for AI Global Weather Forecasting Models

Cheng, Wencong, Xia, Jiangjiang, Qu, Chang, Wang, Zhigang, Zeng, Xinyi, Huang, Fang, Li, Tianye

arXiv.org Artificial Intelligence

To promote the practical application of AI Global Weather Forecasting Models (AIGWFM), we have developed an adaptable application platform named 'YanTian'. This platform enhances existing open-source AIGWFM with a suite of capability-enhancing modules and is built on a loosely coupled plug-in architecture. The goal of 'YanTian' is to address the limitations of current open-source AIGWFM in operational applications, including improving local forecast accuracy, providing spatially high-resolution forecasts, increasing the density of forecast intervals, and generating diverse products with AIGC capabilities. 'YanTian' also provides a simple, visualized user interface, allowing meteorologists to access both the basic and extended capabilities of the platform by simply configuring the platform UI; users need neither complex artificial-intelligence knowledge nor coding skills. Additionally, 'YanTian' can be deployed on a PC with GPUs. We hope 'YanTian' will facilitate the widespread operational adoption of AIGWFMs.


Physics-informed active learning for accelerating quantum chemical simulations

Hou, Yi-Fan, Zhang, Lina, Zhang, Quanhao, Ge, Fuchun, Dral, Pavlo O.

arXiv.org Artificial Intelligence

Quantum chemical simulations can be greatly accelerated by constructing machine learning potentials, which is often done using active learning (AL). The usefulness of the constructed potentials is often limited by the high effort required and their insufficient robustness in the simulations. Here we introduce end-to-end AL for constructing robust, data-efficient potentials with an affordable investment of time and resources and minimal human interference. Our AL protocol is based on physics-informed sampling of training points, automatic selection of initial data, uncertainty quantification, and convergence monitoring. The versatility of this protocol is shown in our implementation of quasi-classical molecular dynamics for simulating vibrational spectra, a conformer search of a key biochemical molecule, and the time-resolved mechanism of Diels-Alder reactions. These investigations took us days instead of weeks of pure quantum chemical calculations on a high-performance computing cluster. The introduction of machine learning potentials (MLPs) pushed the boundaries of what was previously possible in molecular dynamics (MD). MLPs enable simulations of longer time scales and larger systems with higher accuracy.
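The uncertainty-driven part of such a protocol can be caricatured as a query-by-committee loop on a synthetic 1-D surface. This is a generic sketch, not the authors' physics-informed implementation: the RBF surrogate, its lengthscale, the target function, and the budget are all invented for illustration, and the stand-in `target` replaces what would be a quantum chemistry calculation.

```python
import numpy as np

rng = np.random.default_rng(1)

def target(x):
    """Stand-in 1-D potential surface; a real run would call a QM code here."""
    return np.sin(3 * x) + 0.5 * x**2

def kernel(a, b, ls=0.5):
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / ls**2)

def ensemble_predict(X, y, Xq, n_models=5, ridge=1e-6):
    """Bootstrap ensemble of kernel-ridge surrogates; member spread = uncertainty."""
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), len(X))
        alpha = np.linalg.solve(kernel(X[idx], X[idx]) + ridge * np.eye(len(idx)), y[idx])
        preds.append(kernel(Xq, X[idx]) @ alpha)
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# Active-learning loop: repeatedly label the pool point the committee disagrees on most.
pool = np.linspace(-2.0, 2.0, 201)
chosen = list(range(0, 201, 50))            # indices of a small initial grid
X, y = pool[chosen], target(pool[chosen])
for _ in range(15):
    _, std = ensemble_predict(X, y, pool)
    std[chosen] = -1.0                      # never re-query an already labeled point
    i = int(np.argmax(std))                 # physics-informed sampling would bias this
    chosen.append(i)
    X, y = np.append(X, pool[i]), np.append(y, target(pool[i]))

mean, _ = ensemble_predict(X, y, pool)
rmse = float(np.sqrt(np.mean((mean - target(pool)) ** 2)))
```

The loop concentrates labels where the surrogate is least certain, which is why AL needs far fewer expensive quantum chemical evaluations than uniform sampling.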


On the Effects of Heterogeneous Errors on Multi-fidelity Bayesian Optimization

Foumani, Zahra Zanjani, Yousefpour, Amin, Shishehbor, Mehdi, Bostanabad, Ramin

arXiv.org Machine Learning

Bayesian optimization (BO) is a sequential optimization strategy that is increasingly employed in a wide range of areas including materials design. In real-world applications, acquiring high-fidelity (HF) data through physical experiments or HF simulations is the major cost component of BO. To alleviate this bottleneck, multi-fidelity (MF) methods are used to forgo the sole reliance on the expensive HF data and reduce the sampling costs by querying inexpensive low-fidelity (LF) sources whose data are correlated with HF samples. However, existing multi-fidelity BO (MFBO) methods operate under the following two assumptions that rarely hold in practical applications: (1) LF sources provide data that are well correlated with the HF data on a global scale, and (2) a single random process can model the noise in the fused data. These assumptions dramatically reduce the performance of MFBO when LF sources are only locally correlated with the HF source or when the noise variance varies across the data sources. In this paper, we dispense with these incorrect assumptions by proposing an MF emulation method that (1) learns a noise model for each data source, and (2) enables MFBO to leverage highly biased LF sources which are only locally correlated with the HF source. We illustrate the performance of our method through analytical examples and engineering problems on materials design.
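A minimal sketch of only the per-source-noise half of this idea (ours, not the paper's emulator; the sources, noise levels, and bias region are invented): estimate a separate noise level for each source from its replicates, then fuse the source means with inverse-variance weights. The locally biased LF source deliberately exhibits the failure mode the paper's second contribution addresses, since precision weighting alone cannot remove a local bias.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0.0, 1.0, 50)
truth = np.sin(2 * np.pi * x)

# Two sources with heterogeneous noise; the LF source is also locally biased.
hf = truth + rng.normal(0.0, 0.05, (5, 50))                    # few, accurate replicates
lf = truth + 0.3 * (x > 0.7) + rng.normal(0.0, 0.2, (40, 50))  # many, cheap, biased

def fuse(a, b):
    """Precision-weighted fusion with a separately estimated noise level per source."""
    va = a.var(axis=0, ddof=1).mean() / a.shape[0]   # variance of each source's mean
    vb = b.var(axis=0, ddof=1).mean() / b.shape[0]
    wa, wb = 1.0 / va, 1.0 / vb
    return (wa * a.mean(axis=0) + wb * b.mean(axis=0)) / (wa + wb), va, vb

fused, va, vb = fuse(hf, lf)
rmse = lambda m: float(np.sqrt(np.mean((m - truth) ** 2)))
```

The fused estimate beats the LF mean, yet in the biased region it is pulled off target and ends up worse than the HF mean alone, which is exactly why a model of where the LF source is locally correlated with the HF source is needed on top of per-source noise models.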


Learning to Shape by Grinding: Cutting-surface-aware Model-based Reinforcement Learning

Hachimine, Takumi, Morimoto, Jun, Matsubara, Takamitsu

arXiv.org Artificial Intelligence

Object shaping by grinding is a crucial industrial process in which a rotating grinding belt removes material. Object-shape transition models are essential to achieving automation by robots; however, learning such a complex model that depends on process conditions is challenging because it requires a significant amount of data, and the irreversible nature of the removal process makes data collection expensive. This paper proposes a cutting-surface-aware Model-Based Reinforcement Learning (MBRL) method for robotic grinding. Our method employs a cutting-surface-aware model as the object's shape transition model, which in turn is composed of a geometric cutting model and a cutting-surface-deviation model, based on the assumption that the robot action can specify the cutting surface made by the tool. Furthermore, according to the grinding resistance theory, the cutting-surface-deviation model does not require raw shape information, making the model's dimensions smaller and easier to learn than a naive shape transition model directly mapping the shapes. Through evaluation and comparison by simulation and real robot experiments, we confirm that our MBRL method can achieve high data efficiency for learning object shaping by grinding and also provide generalization capability for initial and target shapes that differ from the training data.


Deep neural networks from the perspective of ergodic theory

Zhang, Fan

arXiv.org Artificial Intelligence

Artificial neural networks have demonstrated great potential in their ability to learn existing knowledge, and to interpolate or even slightly extrapolate to new situations. The Hessian at critical points has many eigenvalues in high parameter dimensions, and it is more likely that they take on both positive and negative values, giving us saddle points; we are therefore less likely to be trapped, and could instead slip out along the negative-curvature directions.
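The claim about saddle-point prevalence is easy to check numerically: the spectrum of a random symmetric "Hessian" in even a few hundred dimensions almost surely contains eigenvalues of both signs. This is a generic random-matrix illustration, not the Hessian of a trained network.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n))
H = (A + A.T) / 2.0                    # random symmetric matrix as a stand-in Hessian
eig = np.linalg.eigvalsh(H)
frac_neg = float(np.mean(eig < 0))     # Wigner semicircle: roughly half the spectrum
```

A critical point with mixed Hessian signature is a saddle, so in high dimensions gradient descent is far more likely to meet escapable saddles than to be trapped at points where every eigenvalue is positive.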